Quantitative Methods for the Prioritization of Foods Implicated in the Transmission of Hepatititis E to Humans in Italy

Hepatitis E is considered an emerging foodborne disease in Europe. Several types of foods are implicated in the transmission of the hepatitis E virus (HEV) to humans, in particular, pork and wild boar products. We developed a parametric stochastic model to estimate the risk of foodborne exposure to HEV in the Italian population and to rank the relevance of pork products with and without liver (PL and PNL, respectively), leafy vegetables, shellfish and raw milk in HEV transmission. Original data on HEV prevalence in different foods were obtained from a recent sampling study conducted in Italy at the retail level. Other data were obtained by publicly available sources and published literature. The model output indicated that the consumption of PNL was associated with the highest number of HEV infections in the population. However, the sensitivity analysis showed that slight variations in the consumption of PL led to an increase in the number of HEV infections much higher than PNL, suggesting that PL at an individual level are the top risky food. Uncertainty analysis underlined that further characterization of the pork products preparation and better assessment of consumption data at a regional level is critical information for fine-tuning the most risky implicated food items in Italy.


Introduction
Hepatitis E virus (HEV) is the causative agent of hepatitis E, an emerging disease of worldwide occurrence affecting humans. The clinical course of hepatitis E is frequently asymptomatic. Clinical signs include fever, anorexia and jaundice. Extrahepatic manifestations and serious sequelae, including chronic condition leading to liver failure and death, may occur especially in immunocompromised patients and in the presence of comorbidity [1][2][3][4][5]. A major risk for chronic and fulminant hepatitis E is reported for pregnant women, with the possibility of abortion or infant mortality [6].
For many decades, in Europe and in the USA, hepatitis E was considered a health problem limited to travelers coming back from areas where hepatitis E was endemic [7,8]. However, since the early 1990s, autochthonous cases have been increasingly reported [9]. In Europe and other high-income countries, hepatitis E is considered a foodborne zoonosis causing mainly sporadic cases [7,8,10]. Outbreaks of HEV infection were also occasionally reported in the European Union (EU) [11][12][13]. In the EU, however, surveillance of HEV infection is sparse and not harmonized, hampering the possibility to adequately characterize the epidemiology of hepatitis E, including the accurate identification of the food items implicated in the transmission of HEV to humans. HEV genotypes affecting humans mainly belong to HEV3 and HEV4. These genotypes are frequently detected in pigs and wild boars, which are considered the main HEV animal reservoirs [3]. The majority of human cases are attributable to consumption of pork and wild boar meat and products thereof [2,14,15]. Likewise, in Italy, evidence from several studies indicates that the consumption of raw or undercooked pork and wild boar meat is an important risk factor [14,15]. Some studies also reported shellfish and leafy vegetables as vehicles potentially implicated in foodborne transmission of HEV [16][17][18][19]. Evidence is also available of a higher risk for professionally-exposed workers such as veterinarians, farmers and hunters [20][21][22][23].
Circulation of HEV in farmed pigs in Italy is widely documented [24][25][26][27][28]. In the food chain, HEV has been detected in pork foods such as dry and fresh sausages at retail level [29][30][31] but also in shellfish sampled in the production areas or in biomonitoring points [32,33], at retail [34] or in biomonitoring points [35]. Observational studies conducted in several countries also documented HEV contamination of vegetables and fruits [18,[36][37][38]. HEV RNA was found in sewage and surface water samples, suggesting possible environmental contamination via recycled water [33,39,40].
Understanding the dynamics and transmission of HEV from animal reservoirs to humans and of food contamination is key to reducing the incidence of hepatitis E in the population through the adoption of specific control actions in the food production chain. In recent years, several studies modeled the dynamic of HEV spread along the food chain in different EU countries at both the pre-harvest and post-harvest level, including consumer exposures [41][42][43][44][45].
In our study, we developed a mathematical model to rank the importance of various types of food potentially implicated in the transmission of HEV to humans in the adult Italian population (around 50,000,000). The food categories considered in our study are pork products with liver (PL), pork products without liver (PNL), bivalve shellfish (SH), green leafy vegetables (GV) and raw milk (RM).

Methods
In order to obtain the ranking of food items most frequently implicated in HEV transmission in Italy, we developed a parametric stochastic model to estimate the expected number of newly infected persons who develop HEV infection in the Italian population (≥18 years) through the consumption of the different foods in a one-year period.
The analyses were carried out with the R software version 3.6.0 [46]. For the heaviest calculation, the Gauss Cluster at the Turing Lab of Mathematics Department "Guido Castelnuovo" of Rome "La Sapienza" was used (http://centrocalcolo.mat.uniroma1.it, http://turinglab.mat.uniroma1.it, accessed on 1 November 2021).

The Mathematical Model
We modeled the individual infectious dose distribution S to HEV using a proxy of the infectious dose based on data available in the literature. We thus built the distribution C i of HEV concentration in a food serving of category i, based on data on prevalence of HEV contamination of food at retail obtained from a recent sampling study in Italy. Using these two quantities, we estimated the probability q i for a single person to develop a HEV infection after the consumption of a single serving of food belonging to category i. The average number of portions of each food consumed per year per person and over the number of susceptible individuals in the Italian population ( Figure 1) were then summed up to estimate the average number of newly infected cases in the Italian population in one year. All the parameters used in the model are summarized in Table 1.
The model framework is showed in Figure 1.   proportion of HEV seropositive population [50] probability of infection after consumption of one single serving T i ∼ Geom(q i ) number of failures before first successful exposure to HEV We built the distribution of concentration per serving C i for each food category using HEV load data in food expressed in genome equivalent per gram of food (g.e./gram) (see Section 2.2), and defining a mean serving size serv i (grams) for each food category, according to guidelines of Council for Agricultural Research and Analysis of the Agricultural Economy [51]. Multiplying these two quantities, we obtained the distribution of the total genome equivalent HEV per serving (g.e./serving). The distribution was built as a mixture of random variables as follow C i 0 is a Dirac delta with point mass in zero and represents the HEV negative food samples belonging to category i. The random variable C i 1 models the distribution of the viral concentration in the HEV positive samples belonging to category i and it is distributed according to an exponential distribution. The rate of the exponential distribution is the inverse of the mean viral concentration λ −1 i of HEV positive samples that we estimated using the maximum likelihood estimation (MLE). The weights α i and β i are the fraction of HEV negative and positive food samples, respectively.
To model the individual infectious dose distribution S, we used data from outbreaks for which the HEV load in implicated food (g.e./gr) was documented [13,[47][48][49]. We fitted S on the estimated foodborne HEV intake (g.e.) of cases involved in these outbreaks. The individual intake of HEV (g.e.) was estimated based on the viral concentration in the implicated food (HEV g.e./gr) for a mean serving size (gr) of the implicated food. The mean serving size (serv 0 ) was estimated according to the same data source used for serv i . We modeled S as an exponential distribution with parameter µ, which was estimated using MLE.
Any exposure to HEV, through the consumption of a single food serving, leading to an intake of HEV g.e. higher than the individual infectious threshold was defined as a new HEV infection event. We defined q i as the probability of infection given the consumption of one single food serving. Each foodborne exposure to HEV was considered independent and no cumulative exposure to HEV in multiple meals was assumed possible.
Based on the available data on food consumption in Italy, we estimated the number of average servings consumed in one year per person d i a , for each food category i. Considering each meal as a Bernoulli trial of probability q i , we built a geometrical random variable T of parameter q i modeling the number of failures before the first successful exposure (i.e., infection). We estimated the probability p i for an individual to become infected in one year as p i = P(T ≤ d i a ) meaning the probability that the first successful exposure happens within one single year. The number N of susceptible individuals in the Italian adult population was estimated subtracting to the total Italian population the fraction of HEV seropositive individuals. This latter fraction was estimated based on data from a HEV seroprevalence survey among blood donors in Italy. We assumed a long-life immunization status against HEV of seropositive subjects (i.e., no reinfection possible). Moreover, we assumed immune individuals homogeneously distributed all over the Italian population. We considered the total number of new infections per year X, given the consumption of food belonging to category i, to be binomially distributed with parameters (N, p i ). Whence, we obtained the average number of new infected individuals in one year, as the expected value of X, meaning N i in f = E(X) = N p i .

Data and Data Sources
Data on HEV prevalence and concentration in food were generated from a sampling study carried out in Italy between 2016 and 2019 within the project "CCM 2016-Hepatitis E, an emerging problem in food safety" (Suffredini, personal communication). Briefly, 730 samples were collected at retail level in different areas of Italy: North, Center and South (see Table 2). Samples were analyzed using matrix-specific viral concentration procedure for pork products [52], bivalve shellfish, green leafy vegetables (ISO, 2017) and raw milk [12]. Detection and quantification of HEV in samples was performed by real-time RT-qPCR as detailed in Di Pasquale et al. [53]. The number of individuals susceptible to HEV in the Italian population was estimated based on official demography data (as of 1 January 2021) (ISTAT (http://dati.istat.it/, accessed on 6 December 2021) and on a HEV seroprevalence study conducted among 10,011 blood donors' plasma unit samples (≥18 years) in 2018 (0.02% of the adult Italian population as of 31 December 2018) [50].
The number of food servings consumed in one year by a single person was estimated based on food consumption data sourced from a nation-wide consumption survey conducted in Italy in 2005-2006 [54], available on FAO/WHO GIFT tool platform (http://www.fao.org/gift-individual-food-consumption/en/, accessed on 1 November 2021). Details are reported in Appendix A.

Uncertainty and Sensitivity
We performed uncertainty and sensitivity analysis to quantify the variability in the output due to the variability in the input parameters [55]. We followed a sampling-based method as described in Saltelli [56,57] and generated 10,000 samples for each estimated parameter (i.e., cons i day , λ and µ) using parametric bootstrap [58,59] and ran the model obtaining a sample of the same size for the output. This output sample was used to quantify the uncertainty and to perform the sensitivity steps with the aim to explore the effect of each parameter on the model output. We followed a two-step approach. We first analyzed the scatterplots of input parameters versus the output and then calculated the standardized regression coefficient for each of the parameters by linear regression analysis. This latter step gave us the metric to rank the parameters. Details are described in Appendixes C and D.

Evidence from the Italian Surveillance System
Information on food consumption in cases of hepatitis E reported to the Italian surveillance system for acute viral hepatitis (SEIEVA) between 2016 and 2019 was used to discuss the outputs of the model. The SEIEVA is a voluntary system set up in 1985 by the Italian National Institute of Health, now covering 83% of the Italian population [60,61]. Since 2007, the local health units voluntary participating in the surveillance are required to perform and report HEV laboratory testing. Hepatitis E case definition is based on the positivity to IgM anti-HEV antibodies and elevate serum transaminases level (with or without clinical symptoms). Since the start of the SEIEVA activities, information on risk factors including food exposures were collected using a standardized questionnaire for all cases of acute viral hepatitis. The questionnaire was revised and released to include specific hepatitis E risk factors in late 2016. This activity was also completed within the national project "CCM 2016-Hepatitis E, an emerging problem in food safety".

Parameter Estimation
Food-specific parameter estimations are reported in Table 3. We also reported a 95% interval confidence for the parameters estimated directly from data.
The results of the food sampling survey indicated that the highest proportion of HEV positive samples (i.e., HEV food prevalence α i ) belongs to PL products (11%), followed by PNL (2.8%) and SH (0.48%). No positive samples were found for the GV and RM categories (0% prevalence) ( Table 3). The number of expected genome equivalents per 100 gr size of serving (i.e., λ −1 i = E(C i 1 )) for PL and PNL products were 7.8 × 10 4 g.e./serving and 3.4 × 10 4 g.e./serving, respectively. The average servings consumed per person per year was 3 and 73, respectively, for PL and PNL. The only positive sample for SH resulted in 8.7 × 10 4 g.e. per a 150 g average serving size. The expected number of SH servings consumed yearly are 26. For GV and RM categories, we did not obtain results for λ i because no positive samples were detected. These estimates are shown in Table 4, where all the parameters directly involved in the ranking activity are also reported. We included in Appendix E a risk matrix that uses these parameters to also profile a qualitative ranking of the food categories. The other parameter estimations are reported in Table 5. The average serving size consumed at outbreak events serv out resulted to be 100 gr, yielding to a mean individual infectious dose of 7.3 × 10 6 g.e. (i.e., µ −1 ). The proportion of seropositive individuals of Italian population h was derived by Spada et al. [50] and resulted to be 8.3%. We subtract this proportion to the total Italian population to obtain the amount of susceptible individuals. The total population as of 1 January 2021 was estimated to be 50,208,329, yielding to 45,840,205 susceptible individuals. All the detail on methodological aspects of parameters estimation are reported in Appendix B.

Model Output
For each food category, the outputs of each step of the model are presented in Table 6, including individual probability of infection following the consumption of a single serving q i , individual probability of infection in a year p i and the expected number of new infected per year N i in f = E(X i ). Table 6. Probability of infection following the consumption of a single serving (q i ) and in one year (p i ) and expected number of new infected individuals per year per food category in the Italian population (N in f ).
In addition, we reported density and cumulative probability sketches for new HEV infected individuals per year X PL , X PNL and X SH (Figure 2). Standard deviations of these three binomial random variables are 409 for PL, 668 for PNL and 260 for SH.

Uncertainty Analysis
We obtained bootstrap samples from input parameters and model output as described in Section 2.2.1. We considered parameters uncorrelated given the Pearson correlation test results that are reported in Appendix C, where standard deviations of input parameter samples are also shown (see Table A1).
In Table 7, we reported summary statistics of the output distribution for the three categories involved in the analysis. In Figures 3-5, we displayed the histogram and the sample cumulative distribution of the output samples for category PL, PNL and SH, respectively. The output shown for this analysis is the individual infection probability in a year p i .

Sensitivity Analysis
From uncertainty analysis, we obtained samples of input parameters and output used to perform the sensitivity analysis. The three scatterplots (Figures 6-8) show the variations of the output samples versus the three parameters involved in the analysis, underlying the strength of the dependencies between them. The shape cons i day plot is a bit flattened on the bottom with no strong structure for all the food categories, suggesting a very light dependency between the output and this parameter. λ seems to have a little more structured outline, especially for the PL category, but the output exhibits the strongest dependency with the µ parameter whose plots shows a well-defined shape, especially for PL and PNL.   To quantify the relative importance of input parameters, we conducted a regression analysis using λ i , cons i day and µ as covariates. This, as explained in [57] (Cap.1), allowed us to rank these three input parameters based on their impact on the output. Results are reported in Tables 8-10.   As already suggested by scatterplots, µ was the parameter that most influences the output for both the categories PL, PNL. The overall influence order for each category is the following: |μ| > | λ PNL | > | cons PNL day | From the value of the standardized estimates of the coefficients, we can evaluate the impact of each of them for perturbations equal to a fixed fraction of parameter's standard deviation [56] (Cap. 6). For the PL category, µ has an impact 4% higher than cons PL day and 14% higher than λ PL , while λ PL has an impact 11% higher than cons PL day . For the PNL category, µ has an impact 55% higher than λ PNL and 68% higher than cons PNL day . The impact of λ PNL is about 29% higher than cons PNL day . For the PL category, it is clear that cons PL day and λ PL are very close and all the three parameters have a relatively small impact.
The R 2 statistics for both categories are high, 0.97 for PL and 0.83 for PNL, meaning that these three parameters account for almost the entire uncertainty in the output.
We found that µ has an impact 98% higher than λ and 92% higher than cons SH day , for a perturbation equal to a fixed fraction of parameter's standard deviation. The R 2 for this parameter resulted much smaller with a 0.51, suggesting that other factors are contributing to the uncertainty. This is not surprising given that we have only one positive sample for this category.
Last, in order to have a final quantification of the stability of the output, we made a simple experiment on d a i parameter (so, indirectly on cons i day ). We increased the d a PL to actually see the results in number of exposed people per year. We observed that with d a PL = 6 the number of infected people reached 335, 446 and with d a PL = 8 N PL in f = 446, 715. More details on results of these analysis are reported in Appendix C.

Food Consumption in Italian Hepatitis E Cases
Between 2016 and 2019, a total of 213 autochthonous cases of hepatitis E were reported to the SEIEVA system. The availability of information on foods consumed by patients before the onset of the disease varied considerably among the different foods, depending on the type of questionnaire administered to hepatitis E patients. Information on shellfish consumption was available for a high proportion of cases (N = 186; 87%) because the exposure to this foodstuff was investigated through both the general questionnaire for acute viral hepatitis and the specific questionnaire for hepatitis E, in place from late 2016. For all the other food items, which were investigated with the hepatitis E questionnaire only, this proportion did not exceed the 43% of the hepatitis E cases (Table 11), making the uncertainty around the exposure prevalence to these foods much higher. Pork meat and pork cured meat were by far the food items most frequently consumed by hepatitis E cases, with more than 69% of the patients having consumed these foods, followed by pork liver, fruits, shellfish, vegetables and wild boar meat (Table 11).

Discussion
The model output shows that the consumption of PNL led to the greatest exposure to HEV in the Italian population and was associated with the highest number of new expected HEV infections per year, followed by the consumption of PL and SH. Based on these findings, the risk posed by PNL is ranked first at the population level among foods implicated in the transmission of HEV, followed by PL and SH. For the other foods considered by our study (i.e., GV and RM), no expected cases of HEV infections were estimated by our model. The consumption of pork products has been frequently indicated as a risk factor for foodborne HEV infection [62]. This type of food has also been frequently linked to foodborne outbreaks of hepatitis E [47,48,63,64]. The consumption of shellfish has also been pointed out as a possible risk factor in some studies [48,65], although, to our knowledge, no outbreaks implicating the consumption of contaminated shellfish have ever been reported in Europe.
PNL are consumed much more frequently than PL and SH and by a larger proportion of population. This explains why the highest expected number of new cases in the population is associated with this food despite the mean prevalence of HEV contamination and the viral load per serving being higher for PL and SH.
The sensitivity analysis indicates that even a slight increase in the consumption of PL servings at the individual level results in a remarkable increase in the expected number of new cases of HEV infection. As an example, passing from three to eight portions of PL consumed per person per year, which is a realistic variation at individual level, the number of new HEV infections in the population increases from about 168,000 to approximately 450,000 cases, revealing that the number of servings of PL at the individual level is a critical element to be taken into account for food risk ranking. Similar variations in the consumption of PNL do not result in comparable increases in the number of new HEV infection cases in the population. These findings provide evidence of the importance of collecting very accurate data on PL consumption at the population level in order to strengthen the HEV food risk ranking and highlight the importance of PL consumption for the risk of HEV transmission at the individual level.
The consumption of PL in Italy varies hugely among both individuals and population subgroups, depending not only on personal preferences but also on traditional differences in consumption habits. There is a wide geographic variability in the recipes and mode of preparation of pork products, with important local peculiarities, especially for products such as cured meat and offal. This may lead to highly heterogeneous consumption of PL and exposure to HEV in specific population subgroups and geographical area. Unfortunately, the food consumption data source used in our study lacks sufficient details to allow for more accurate estimations of HEV exposure associated with the different types of PL.
Food consumption data from hepatitis E cases reported to the SEIEVA (see Section 3.5) are too scarce to support a formal validation process of our model. However, they are in line with the evidence obtained in our study about the importance of both PNL and PL as top risk foods for the transmission of HEV to humans. The consumption of pork meat and pork cured meat was reported by a high proportion of cases (>70%), while pork liver as well as shellfish by a much lower proportion of cases (i.e., each food group not exceeding 30% of the cases). Although the SEIEVA data would indicate that a minor proportion of hepatitis E cases had consumed pork liver, it is necessary to consider that this food is not usually consumed as single food but it is much more frequently consumed as an ingredient in mixed pork cured meat, such as sausages, salami, mortadella, etc., which were consumed by a high proportion of cases. Unfortunately, the lack of food consumption data from healthy controls hampers drawing more specific conclusions on the magnitude of the association between PNL and PL food consumption and hepatitis E, at the individual level.
Our model was built to support risk ranking. The sensitivity analysis shows that the parameter that brings the larger uncertainty is the mean µ −1 of the HEV individual infectious dose distribution. This is not surprising given that the scarce availability of data to estimate µ affected the possibility to build a robust dose-response model, similar to many other exposure studies in humans. In our study, we used the total HEV load (g.e.) in food implicated in hepatitis E outbreaks as a proxy for the individual HEV infectious dose in humans. It is impossible, however, to evaluate to what degree the quantitative assessment of HEV in food differs from the true individual infectious dose. In addition, two different sources of uncertainty affect our dose-response model. On one hand, we only found four outbreak reports in the literature providing the information needed. On the other hand, the uncertainty associated with the quantitative methods used for the assessment of HEV in food should be also considered.
To obtain more reliable estimates of the actual number of HEV infections in humans, a more robust estimation of parameter µ would be needed. Nonetheless, the food ranking does not appear to be influenced by the uncertainty introduced by this specific parameter since the dose-response model acts in the same way on all types of food considered in our study. The parameters potentially introducing differences in the ranking are the ones displayed in Table 4. To have a more direct view of how these parameters affect the ranking, we have provided a qualitative risk classification by building a risk matrix. The analysis is reported in Appendix E.
Other important factors affecting the model outcomes are the mean quantity of food consumed in a year cons day and the mean viral concentration in food λ −1 . While the λ parameter was estimated from data from a large national sampling study, the cons day was estimated from a consumption dataset whose current reliability is difficult to assess for several reasons. First, the survey was conducted in the Italian population more than fifteen years ago and data might no longer reflect the current consumption in terms of type of food, frequency of consumption and quantity with the same level of accuracy. Second, the consumption data refer to general food categories and the lack of details on the food ingredients makes it difficult to extract the true quantities of food consumed for some categories, introducing uncertainty, in particular, for PL. Finally, estimates were only available at the national level and did not allow to take account for regional and local differences, which in the case of PL and PNL may be critical, as described above.
Another limitation of our study is the use of one single dataset for the estimation of the prevalence of HEV contamination in food. This methodological choice was driven by the lack of full comparability of the prevalence estimates among different studies due to a poor harmonization of laboratory methods used to detect HEV in food. In addition, the uncertainty analysis that we performed in our study would have been impossible using estimates from other studies. Due to the extremely high variety of foods and mode of preparation, estimating the prevalence of HEV contamination in food based on one single survey may introduce a selection bias depending on the goodness of randomization of the samples. In our study, this type of bias may be suspected for GV and SH prevalence esti-mation. For both these foodstuffs the estimated prevalence was 0.5% and 0%, respectively, representing a highly discrepant value compared with other similar studies conducted in Italy and abroad. In SH, Suffredini et al. and La Rosa et al. [32,33] reported much higher prevalence of HEV contamination with values ranging up to 8.1%. The prevalence of GV contamination, although considerably lower than SH, was never estimated to be 0% in other studies [18,37,38]. These considerations suggest that the role of SH and GV for HEV transmission in the Italian population may be more important than our study showed. In terms of risk ranking, however, this does not appear to substantially change our results. Different is the case of RM. This food was included in our study because it was focused as a potential risk food for HEV transmission in China in 2016, where a high prevalence of active HEV infection in cows was reported [66]. However, no further studies confirmed these findings [67][68][69].

Conclusions
Our model allowed to rank the relevance of different food categories for HEV transmission in the Italian population. In agreement with the literature and with data from Italian surveillance, pork products with and without liver emerged as the most important food implicated in HEV transmission. In particular, since our data and analyses highlight a specific risk associated with liver-containing products, actions to reduce the risk could leverage on the reduction of HEV contamination of the final products, for example, by screening livers destined to make up seasoned cured meat products or by reducing the viral load through heat treatment (e.g. pasteurization) of tissues and livers allocated to food production. Moreover, consumer information on the risk posed by liver consumption represents an important component of the mitigation strategies, in particular, in areas where the consumption of liver-containing traditional products is popular.
Another important value of our study is its contribution in shedding light into the existing data gaps that the research and surveillance activities should cover in the future to improve the risk ranking. From this perspective, the availability of a mathematical model represents a useful tool that could be further refined as new data and knowledge will be made available.

Funding:
The study has received funding from the Italian Ministry of Health in the framework of the national project "Hepatitis E, an emerging problem in food safety: 'One Health' approach for risk assessment" (CCM 2016 program).

Data Availability Statement:
Unpublished data on viral presence in food used are not publicly available. All the other datasets used are referenced in Section 2.2. R code used for main estimation can be found here: https://github.com/Mezzanenne/Risk_ranking_Script, accessed on 1 November 2021.

Abbreviations
The following abbreviations are used in this manuscript:

Appendix A. Consumption Data
Observations from the consumption survey database [54] were filtered from the survey data to select the products belonging to each food category using the "ingredient" variable. For RM and PL, it was not possible to filter directly from data. For the former, we assumed that all the dairy cow milk consumed was raw, choosing a worst-case scenario. For the latter, data were not available for all the processed and/or seasoned food items. Therefore, we assumed that 20% of all pork sausages could contain liver.

Appendix B.1. Maximum Likelihood Estimation
For the parameter estimated from data, we used the maximum likelihood estimation technique. We report here the estimator definition as in [58] (Par. 9.3).
The maximum likelihood estimation is one of the most common methods for estimating parameters in a parametric model. Let X 1 , ·, X n be independent and identically distributed samples from a random variable with probability density function f (x; θ) depending on an unknown parameter θ. Definition A1. The likelihood function is defined by The maximum likelihood estimator (MLE), denoted byθ n , is the value of θ that maximizes L n (θ), meaningθ So, for our purposes, X 1 , . . . , X n are the data values that we supposed coming from a given probability distribution. In particular, we supposed that all the data we had came from an exponential distribution. Let us define what an exponential distribution is.
Definition A2. A continuous random variable whose probability density function is given, for some λ > 0, by is said to be exponential random variable with parameter (or rate) λ.
It is possible to prove the following result: Proposition A1. If X 1 , . . . , X n are independent exponential random variables each having mean θ, the maximum likelihood estimator of θ is the sample mean We then estimated the rate of the exponential distribution λ simply as the reciprocal 1 θ of the sample mean.

Appendix B.2. Confidence Interval
As we supposed all the data coming from exponential distributions and given the use of the MLE for the parameter estimation, we refer to [70] (Par. 5.7, Par. 7.6) to build confidence intervals. We need to introduce some results. No proofs will be presented here, the entire development is available in [70].
We begin defining the gamma distribution.
Remark A1. It is straightforward to see, given the two previous definitions, that when α = 1 the gamma distribution reduces to the exponential with mean 1 λ (recalling that 0! = 1.).
Proposition A2. If X i , i = 1, . . . , n are independent gamma random variables with respective parameters (α i , λ), then ∑ n i=1 X i is gamma with parameters ∑ n i=1 α i , λ From Remark A1 and Proposition A2 follows the next result: Corollary A1. If X 1 , . . . , X n are independent exponential random variables, each having rate λ, then ∑ n i=1 X i is a gamma random variable with parameters (n, λ).
The last type of random variable needed here is the chi squared Definition A4. If Z 1 , . . . , Z n are independent standard normal random variables, then X, defined by X = Z 2 1 + Z 2 2 + · · · + Z 2 n is said to have a chi-square distribution with n degree of freedom. We will use the following notation X ∼ χ 2 n .
A property that we are interested in for this kind of random variable is the following Lemma A1. If X ∼ χ 2 n , X is identical to a Gamma( n 2 , 1 2 ).
From this last result, we obtain that 2 θ ∑ n i=1 X i ∼ χ 2 2n . Now, we recall that if X ∼ χ 2 n we define the quantity χ 2 α,n to be such that An illustration of this value is shown in Figure A1 from [70] (Figure 5.11). Figure A1. The chi-square density function with eight degrees of freedom.
Given the previous results and remarks, for any α ∈ (0, 1), we have Appendix C. Uncertainty Analysis Appendix C.1. Standard Deviations of Input Parameters

. Parameter Correlations
We report the correlation test results and the scatterplots between them for the three categories involved in the analysis, meaning PL, PNL and SH.  Figure A2. Scatterplots paired between input parameters for category PL.  We can see from tables and plots that some of the parameters show a certain structure in plots, especially for the PL category, even if the numbers seem rather small. However, we considered parameter uncorrelated given the small correlation coefficients.

Appendix C.3. Parametric Bootstrap
We report here a brief explanation of basics on parametric bootstrap theory, modified from [59] (Cap. 6, 21).
Suppose we have x = [x 1 , . . . , x n ] a random sample from a probability distribution F and that we are interested in some unknown parameter θ = t(F), on the basis of (x). A good way to estimate the sampling distribution and variance ofθ = MLE(θ) when we know F is to use parametric bootstrap, which consists of two main steps.

•
We draw m samples of size n (same size of original sample) from the parametric model density fθ(y), using the estimationθ obtained with maximum likelihood estimation. • We calculate for each sample the maximum likelihood estimation of θ. This way we obtain a m size sample ofθ For more detail on the topic, see the reference cited above.

Appendix D. Linear Regression
Appendix D.1. Introduction We report here some basics on regression analysis in the context of sensitivity analysis, modified from [56] (Par. 6.6.2, 6.6.3). Performing the sampling-based method for sensitivity analysis, we generated a m sized sample for each of the n input variables and, consequently, for the output. So we want to investigate the map where, for each k x k = [x k1 , . . . , x kn ] represents the k-th observation of the sample of the input variables and y(·) is our model. In the linear regression framework, a model of the formŷ is developed from the mapping between analysis input and analysis results, where the x j are the input variables under consideration and the b j are coefficients that must be determined. The coefficients b j can be used to indicate the importance of the individual x j with respect to the uncertainty in y.
We start from the map showed in (A1), whence there exists a sequence y k = y(x k ), k = 1, . . . , m of values for the output variable, that is the output sample obtained. When expressed in the form of the Equation (A2), each y k becomes where the error terms k , k = 1, . . . , m are defined by k = y k −ŷ k and thus equal the difference between the observed value y k and the corresponding predicted valueŷ k defined by (A2). To determine the b j , the method of least squares is the most widely used. We introduce here the matrix representation for the equality (A3): In the least-squares approach, the b j are determined such that the sum is a minimum, meaning such that the squared error terms is a minimum. After some calculation, provided that the matric x T x is invertible(it is usually the case in samplingbased study in which the number of sample elements (i.e., m) exceeds the number of independent variables (i.e., n).), we have a unique value for b given by: The coefficients b jŝj /ŝ appearing in (A4) are called standardized regression coefficients (SRCs). When the x j are independent, the absolute value of the SRCs can be used to provide a measure of variable importance. Specifically, the coefficients provide a measure of importance based on the effect moving each variable away from its expected value by a fixed fraction of its standard deviation while retaining all other variables at their expected values. Calculating SRCs is equivalent to performing the regression analysis with the input and output variables normalized to mean zero and standard deviation one.

Appendix D.2. Parameter Impact
As explained above, we can use the SRCs to rank the parameters used in the analysis based on their effect on the output. Our linear regression with SRCs resulted in the following expression for the two categories: y PL = −0.864 λ PL + 0.865 cons PL day + 0.99 µ (A5) y PNL = −0.9 λ PNL + 0.88 cons PNL day + 0.995 µ (A6)

Appendix E. Risk Matrix
We report here a risk matrix to qualitatively score the risk of each food category, based on the severity and likelihood of food contamination [71]. We are here considering the parameters reported in Table 4.
We defined the severity as the expected HEV genome equivalents per serving E(C i 1 ) = λ −1 i , transformed in log e . Severity scores definition are reported in Table A5. The likelihood is defined as the fraction of contaminated servings consumed in one year. Meaning the number of servings consumed in a year d a i times the prevalence of HEV contaminated food samples α i . Likelihood scores are displayed in Table A5. Based on the definitions and on the data sheet reported in Table A6, we can calculate the risk for each of the food categories as risk = likelihood x severity (see Table A6) and assign to each category a qualitative risk level, as illustrated in Figure A5.